Abstract. In a recent paper [8] a new approach, called the "embedding
method", was introduced, which makes it possible to use exchangeable
pairs for normal and multivariate normal approximation with Stein's method
in cases where the corresponding couplings do not satisfy a certain
linearity condition. The key idea is to embed the problem into a
higher-dimensional space in such a way that the linearity condition is
then satisfied. Here we apply the embedding to U-statistics as well as
to subgraph counts in random graphs.

1. Introduction 

Stein's method, first introduced in the 1970s, has proven a powerful tool
for assessing distributional distances, such as the distance to the normal
distribution, in the presence of dependence. When considering sums $W$ of
random variables, the dependence between these random variables needs to be
weak in order for the distance to a normal distribution to be small. For
quantifying weak dependence, Stein [12] introduced the method of
exchangeable pairs: construct a sum $W'$ such that $(W, W')$ forms an
exchangeable pair, and such that $\mathbb{E}^W(W' - W)$ is (at least
approximately) linear in $W$. This linearity condition arises naturally
when thinking of correlated bivariate normals. The generalisation of this
approach to a multivariate setting remained untackled until recently, when
Chatterjee and Meckes [2] solved the problem in the case of exchangeable
vectors $(W, W')$ such that $\mathbb{E}^W(W' - W) = -\lambda W + R$ for a
scalar $\lambda$ and a remainder vector $R$ such that $\mathbb{E}|R|$ is
small. This is a rather special case; the authors [8] treated the general
setting

$$\mathbb{E}^W(W' - W) = -\Lambda W + R \tag{1.1}$$

for a matrix $\Lambda$ and a vector $R$ with small $\mathbb{E}|R|$. In a
follow-up paper by Meckes [5] the results by Chatterjee and Meckes [2] and
by the authors [8] are combined using slightly different smoothness
conditions on test functions as compared to [8]; non-smooth test functions
are not treated by Meckes [5], but the bounds obtained there improve on
those from [8] for the example of d-runs with respect to smooth test
functions.

A surprising finding in [8] was that it is often possible to embed a given
random vector into a random vector $W$ of larger, but still finite,
dimension, such that (1.1) holds with $R = 0$; yet this embedding does not
correspond to Hoeffding projections. Here we explore the embedding method
further, by illustrating its use on two important examples. The first
example is complete non-degenerate U-statistics, and the second example
considers the joint count of edges and triangles in Bernoulli random
graphs. In both examples the limiting covariance matrix is not of full
rank; yet the bounds on the normal approximation are of the expected order.

The paper is organised as follows. In Section 2 we review the theoretical
results in [8], giving bounds on the distance to normal under the linearity
condition (1.1), both for smooth test functions and for non-smooth test
functions. In Section 3 we discuss the embedding method, and point out
a link to Rademacher integrals and chaos decompositions. Section 4
illustrates the embedding method for complete non-degenerate U-statistics;
the embedding vector contains lower-order U-statistics which are obtained
via fixing components. Section 5 gives a normal approximation for the joint
counts of the number of edges and the number of triangles in a Bernoulli
random graph; to our knowledge these are the first explicit bounds for this
multivariate problem. The embedding method suggests counting the number
of 2-stars as well, which makes the results not only more informative but
also, surprisingly, easier to derive.



2. Theoretical bounds for a multivariate normal 
approximation 

2.1. Notation. Denote by $W = (W_1, W_2, \dots, W_d)^t$ random vectors in
$\mathbb{R}^d$, where the $W_i$ are $\mathbb{R}$-valued random variables for
$i = 1, \dots, d$. We denote by $\Sigma$ symmetric, non-negative definite
matrices, and hence by $\Sigma^{1/2}$ the unique symmetric square root of
$\Sigma$. Denote by Id the identity matrix, where we omit the dimension $d$.
Throughout this article, $Z$ denotes a random variable having standard
d-dimensional multivariate normal distribution. We abbreviate the transpose
of the inverse of a matrix $A$ as $A^{-t} := (A^{-1})^t$.

For derivatives of smooth functions $h : \mathbb{R}^d \to \mathbb{R}$, we
use the notation $\nabla$ for the gradient operator. Denote by
$\|\cdot\|$ the supremum norm for both functions and matrices. If the
corresponding derivatives exist for some function
$h : \mathbb{R}^d \to \mathbb{R}$, we abbreviate
$|h|_1 := \sup_i \|\frac{\partial}{\partial x_i} h\|$,
$|h|_2 := \sup_{i,j} \|\frac{\partial^2}{\partial x_i \partial x_j} h\|$,
and so on.

We start by considering smooth test functions. 

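Theorem 2.1 (cf. Theorem 2.1 of [8]). Assume that $(W, W')$ is an
exchangeable pair of $\mathbb{R}^d$-valued random variables such that

$$\mathbb{E} W = 0, \qquad \mathbb{E} W W^t = \Sigma, \tag{2.1}$$

with $\Sigma \in \mathbb{R}^{d \times d}$ symmetric and positive definite.
Suppose further that (1.1) is satisfied for an invertible matrix $\Lambda$
and a $\sigma(W)$-measurable random variable $R$. Then, if $Z$ has
d-dimensional standard normal distribution, we have for every three times
differentiable function $h$,

$$\bigl|\mathbb{E} h(W) - \mathbb{E} h(\Sigma^{1/2} Z)\bigr|
  \le \frac{|h|_2}{4} A + \frac{|h|_3}{12} B
  + \Bigl(|h|_1 + \tfrac12 d \|\Sigma\|^{1/2} |h|_2\Bigr) C, \tag{2.2}$$

where, with $\lambda^{(i)} := \sum_{m=1}^d |(\Lambda^{-1})_{m,i}|$,

$$A = \sum_{i,j=1}^d \lambda^{(i)}
      \sqrt{\operatorname{Var} \mathbb{E}^W (W_i' - W_i)(W_j' - W_j)},$$

$$B = \sum_{i,j,k=1}^d \lambda^{(i)}
      \mathbb{E}\bigl|(W_i' - W_i)(W_j' - W_j)(W_k' - W_k)\bigr|,$$

$$C = \sum_{i=1}^d \lambda^{(i)} \sqrt{\mathbb{E} R_i^2}.$$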

The proof of Theorem 2.1 is based on the Stein characterization of the
normal distribution that $Y \in \mathbb{R}^d$ is multivariate normal
MVN$(0, \Sigma)$ if and only if

$$\mathbb{E}\, Y^t \nabla f(Y) = \mathbb{E}\, \nabla^t \Sigma \nabla f(Y)
  \quad \text{for all smooth } f : \mathbb{R}^d \to \mathbb{R}. \tag{2.3}$$

In the paper by Meckes [5] a different norm for functions and for operators
is used to obtain a similar result, and the difference in the bounds
depending on the chosen norm is illustrated for the example of runs on the
line.

Theorem 2.1 can be extended to allow for covariance matrices which are
not of full rank, using the triangle inequality in conjunction with the
following proposition.

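Proposition 2.2 (cf. Proposition 2.9 of [8]). Let $X$ and $Y$ be
$\mathbb{R}^d$-valued normal variables with distributions
$X \sim \mathrm{MVN}(0, \Sigma)$ and $Y \sim \mathrm{MVN}(0, \Sigma_0)$,
where $\Sigma = (\sigma_{i,j})_{i,j=1,\dots,d}$ has full rank, and
$\Sigma_0 = (\sigma^0_{i,j})_{i,j=1,\dots,d}$ is non-negative definite. Let
$h : \mathbb{R}^d \to \mathbb{R}$ have 3 bounded derivatives. Then

$$\bigl|\mathbb{E} h(X) - \mathbb{E} h(Y)\bigr|
  \le \frac12 |h|_2 \sum_{i,j=1}^d \bigl|\sigma_{i,j} - \sigma^0_{i,j}\bigr|.$$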

For non-smooth test functions, following Rinott and Rotar [9], let $\Phi$
denote the standard normal distribution in $\mathbb{R}^d$, and $\phi$ the
corresponding density function. For $h : \mathbb{R}^d \to \mathbb{R}$ set

$$h_\delta^+(x) = \sup\{h(x + y) : |y| \le \delta\}, \qquad
  h_\delta^-(x) = \inf\{h(x + y) : |y| \le \delta\},$$

and $\tilde h(x, \delta) = h_\delta^+(x) - h_\delta^-(x)$.

Let $\mathcal{H}$ be a class of measurable functions
$\mathbb{R}^d \to \mathbb{R}$ which are uniformly bounded by 1. Suppose that
for any $h \in \mathcal{H}$ and any $\delta > 0$, $h_\delta^+$ and
$h_\delta^-$ are in $\mathcal{H}$; for any $d \times d$ matrix $A$ and any
vector $b \in \mathbb{R}^d$, $h(Ax + b) \in \mathcal{H}$; and for some
constant $a = a(\mathcal{H}, d)$,
$\sup_{h \in \mathcal{H}} \int_{\mathbb{R}^d} \tilde h(x, \delta)\, \Phi(dx) \le a \delta$.
Obviously we may assume $a \ge 1$. The class of indicators of measurable
convex sets is such a class where $a \le 2\sqrt{d}$; see the paper by
Bolthausen and Götze [1].

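Let $W$ have mean vector 0 and covariance matrix $\Sigma$. If $\Lambda$ and
$R$ are such that (1.1) is satisfied for $W$, then $Y = \Sigma^{-1/2} W$
satisfies (1.1) with $\hat\Lambda = \Sigma^{-1/2} \Lambda \Sigma^{1/2}$ and
$R' = \Sigma^{-1/2} R$. With

$$\hat\lambda^{(i)} = \sum_{m=1}^d
  \bigl|(\Sigma^{-1/2} \Lambda^{-1} \Sigma^{1/2})_{m,i}\bigr|$$

as well as

$$A' = \sum_{i,j} \hat\lambda^{(i)} \sqrt{\operatorname{Var} \mathbb{E}^W
  \sum_{k,l} \Sigma^{-1/2}_{i,k} \Sigma^{-1/2}_{j,l}
  (W_k' - W_k)(W_l' - W_l)}, \tag{2.4}$$

$$B' = \sum_{i,j,k} \hat\lambda^{(i)} \mathbb{E}\Bigl| \sum_{r,s,t}
  \Sigma^{-1/2}_{i,r} \Sigma^{-1/2}_{j,s} \Sigma^{-1/2}_{k,t}
  (W_r' - W_r)(W_s' - W_s)(W_t' - W_t) \Bigr|, \tag{2.5}$$

and

$$C' = \sum_i \hat\lambda^{(i)}
  \sqrt{\mathbb{E}\Bigl(\sum_k \Sigma^{-1/2}_{i,k} R_k\Bigr)^2},$$

we have the following result [8].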



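Corollary 2.3. Let $W$ be as in Theorem 2.1. Then, for all
$h \in \mathcal{H}$ with $|h| \le 1$, there exist $\gamma = \gamma(d)$ and
$a > 1$ such that

$$\sup_{h \in \mathcal{H}} \bigl|\mathbb{E} h(W) - \mathbb{E} h(\Sigma^{1/2} Z)\bigr|
  \le \gamma^2 \Bigl(-D' \log(T') + \frac{B'}{2\sqrt{T'}} + C' + a \sqrt{T'}\Bigr),$$

with

$$T' = \frac{1}{a^2} \Bigl(\frac{D'}{2}
       + \sqrt{\frac{a B'}{2} + \frac{D'^2}{4}}\Bigr)^2
  \quad\text{and}\quad D' = \frac{A'}{2} + C' d.$$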

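Remark 2.4. We can simplify the above bound further. Using Minkowski's
inequality we have that
$\operatorname{Var} \sum_{j=1}^k X_j \le k^2 \sup_j \operatorname{Var} X_j$,
and thus obtain the simple estimate

$$\operatorname{Var} \mathbb{E}^W \sum_{k,l}
  \Sigma^{-1/2}_{i,k} \Sigma^{-1/2}_{j,l} (W_k' - W_k)(W_l' - W_l)
  \le d^4 \|\Sigma^{-1/2}\|^4 \sup_{k,l}
  \operatorname{Var} \mathbb{E}^W \{(W_k' - W_k)(W_l' - W_l)\}$$

and hence

$$A' \le d^3 \|\Sigma^{-1/2}\|^2 \sum_i \hat\lambda^{(i)} \sup_{k,l}
  \sqrt{\operatorname{Var} \mathbb{E}^W \{(W_k' - W_k)(W_l' - W_l)\}};$$

in $B'$ and $C'$ we could similarly bound $\Sigma^{-1/2}_{i,j}$ by
$\|\Sigma^{-1/2}\|$ to obtain a simpler bound. There are however examples,
such as the random graph example in Section 5, where $\|\Sigma^{-1/2}\|$
provides a non-informative bound.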

Remark 2.5. Note that, if $(W, W')$ is exchangeable and (1.1) is satisfied,
we have

$$\mathbb{E}(W' - W)(W' - W)^t = 2\, \mathbb{E}\, W (\Lambda W)^t
  = 2 \Sigma \Lambda^t. \tag{2.6}$$

On the other hand, if we only have $\mathcal{L}(W') = \mathcal{L}(W)$, we
obtain

$$\mathbb{E}(W' - W)(W' - W)^t = \Lambda \Sigma + \Sigma \Lambda^t. \tag{2.7}$$

Hence, to check in an application whether the often tedious calculation of
$\Sigma$ and $\Lambda$ has been carried out correctly, we can combine
Equations (2.6) and (2.7) to conclude that, under the conditions of
Theorem 2.1, we must have $\Lambda \Sigma = \Sigma \Lambda^t$.
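
The consistency check of Remark 2.5 is easy to automate. The following is a
minimal sketch, not part of the original argument: the helper names and the
two toy covariance matrices are hypothetical, chosen only so that one passes
and one fails the test $\Lambda \Sigma = \Sigma \Lambda^t$. Exact rational
arithmetic is used so that equality can be tested without tolerances.

```python
from fractions import Fraction


def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def passes_symmetry_check(lam, sigma):
    """True iff Lambda * Sigma equals Sigma * Lambda^t exactly."""
    return matmul(lam, sigma) == matmul(sigma, transpose(lam))


# Hypothetical 2x2 example: a lower-triangular Lambda of the kind that
# appears in Section 4, with n = 10.
n = 10
lam = [[Fraction(1, n), Fraction(0)],
       [Fraction(-2, n), Fraction(2, n)]]

# For this Lambda the check forces Sigma_12 = 2 * Sigma_11.
sigma_ok = [[Fraction(1), Fraction(2)], [Fraction(2), Fraction(5)]]
sigma_bad = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(5)]]

print(passes_symmetry_check(lam, sigma_ok))
print(passes_symmetry_check(lam, sigma_bad))
```

A failed check signals an error in either $\Sigma$ or $\Lambda$; a passed
check is only a necessary condition, not a proof that both are right.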
3. The embedding method

Assume that an $\ell$-dimensional random vector $W_{(\ell)}$ of interest is
given. Often, the construction of an exchangeable pair
$(W_{(\ell)}, W'_{(\ell)})$ is straightforward. If, say,
$W_{(\ell)} = W_{(\ell)}(X)$ is a function of i.i.d. random variables
$X = (X_1, \dots, X_n)$, one can choose uniformly an index $I$ from 1 to
$n$, replace $X_I$ by an independent copy $X_I'$, and define
$W'_{(\ell)} := W_{(\ell)}(X')$, where $X'$ is now the vector $X$ but with
$X_I$ replaced by $X_I'$.
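
The resampling construction just described can be sketched in a few lines.
This is an illustration only: the function names, the choice of standard
normal coordinates and the sample size are assumptions made for the example
and do not come from the paper.

```python
import random


def resample_one_coordinate(x, draw, rng):
    """Return (x_prime, i): a copy of x whose coordinate i, chosen
    uniformly at random, is replaced by a fresh independent draw."""
    i = rng.randrange(len(x))
    x_prime = list(x)
    x_prime[i] = draw(rng)
    return x_prime, i


def exchangeable_pair(statistic, x, draw, rng):
    """Evaluate a statistic on x and on the resampled vector x'."""
    x_prime, _ = resample_one_coordinate(x, draw, rng)
    return statistic(x), statistic(x_prime)


# Hypothetical usage: i.i.d. standard normal coordinates, statistic = sum.
rng = random.Random(0)
draw = lambda r: r.gauss(0.0, 1.0)
x = [draw(rng) for _ in range(8)]
x_prime, index = resample_one_coordinate(x, draw, rng)
w, w_prime = exchangeable_pair(sum, x, draw, rng)
```

Because the replaced coordinate has the same law as the original one and the
index is chosen independently of the data, $(X, X')$ and $(X', X)$ have the
same joint distribution, which is what makes the pair of statistics
exchangeable.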

In general there is no hope that $(W_{(\ell)}, W'_{(\ell)})$ will satisfy
Condition (1.1) with $R$ being of the required smaller order or even equal
to zero, so that in this case Theorem 2.1 would not yield useful bounds.

Surprisingly often it is possible, though, to extend $W_{(\ell)}$ to a
vector $W \in \mathbb{R}^d$ such that we can construct an exchangeable pair
$(W, W')$ which satisfies Condition (1.1) with $R = 0$. If we can bound the
distance of the distribution $\mathcal{L}(W)$ to a d-dimensional
multivariate normal distribution, then a bound on the distance of the
distribution $\mathcal{L}(W_{(\ell)})$ to an $\ell$-dimensional multivariate
normal distribution follows immediately.

In order to obtain useful bounds in Theorem 2.1 the embedding dimension $d$
should not be too large. In the examples below it will be obvious how
to choose the additional components $W_{(d-\ell)}$ to make the construction
work.

As a first illustration of the method, it was observed in [6] that for
functions which depend on the first $d$ coordinates of an infinite
Rademacher sequence, that is, a sequence of symmetric $\{-1, 1\}$ random
variables, the natural embedding vector is a vector of Rademacher integrals
of lower order. A similar construction works fairly generally, as follows.
Assume that $F = F(X_1, \dots, X_d)$ is a random variable that depends only
on the first $d$ coordinates of a sequence $X$ of i.i.d. mean zero random
variables, with $\mathbb{E}(F) = 0$ and $\mathbb{E}(F^2) = 1$, of the form

$$F = \sum_{n=1}^d \sum_{1 \le i_1 < \dots < i_n \le d}
      n!\, f_n(i_1, \dots, i_n)\, X_{i_1} \cdots X_{i_n}
    =: \sum_{n=1}^d J_n(f_n); \tag{3.1}$$

such representations occur as chaotic decompositions for functionals of
Rademacher sequences. A natural exchangeable pair construction is as
follows. Pick an index $I$ so that $\mathbb{P}(I = i) = 1/d$ for
$i = 1, \dots, d$, independently of $X_1, \dots, X_d$, and if $I = i$
replace $X_i$ by an independent copy $X_i^*$ in all sums in the
decomposition (3.1) which involve $X_i$. Call the resulting expression
$F'$, and the corresponding sums $J_n'(f_n)$, $n = 1, \dots, d$. Now
choosing as embedding vector $W = (J_1(f_1), \dots, J_d(f_d))$, we check
that for all $n = 1, \dots, d$,

$$\mathbb{E}\bigl(J_n'(f_n) - J_n(f_n) \mid W\bigr)
  = -\frac1d \sum_{i=1}^d \sum_{1 \le i_1 < \dots < i_n \le d}
    \mathbf{1}_{\{i_1, \dots, i_n\}}(i)\, n!\, f_n(i_1, \dots, i_n)\,
    \mathbb{E}(X_{i_1} \cdots X_{i_n} \mid W)
  = -\frac{n}{d} J_n(f_n),$$

since each product $X_{i_1} \cdots X_{i_n}$ is counted once for each of its
$n$ indices. Thus, with $W' = (J_1'(f_1), \dots, J_d'(f_d))$, the condition
(1.1) is satisfied, with $\Lambda = (\lambda_{i,j})_{1 \le i,j \le d}$ being
zero off the diagonal and $\lambda_{n,n} = n/d$ for $n = 1, \dots, d$.
Note that, although $\Lambda$ is diagonal, its diagonal entries are not
equal. It is not possible to correct this by simple coordinate-wise scaling
of $W$, as this will change $\Sigma$ only and leave $\Lambda$ unaffected;
see also the discussion in [5, Section 5]. Hence, again, the generality of
(1.1) is essential here.
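
The linearity claim for the chaos components can be verified exactly by
enumeration in a small case. The sketch below assumes $d = 3$, Rademacher
coordinates, and arbitrary made-up coefficients standing in for
$n!\, f_n(i_1, \dots, i_n)$; it conditions on the full vector $X$, which is
finer than conditioning on $W$ and therefore implies the stated identity.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

d = 3

# Hypothetical coefficients c_S = n! f_n(S), one per non-empty index set S.
coeff = {S: Fraction(1 + sum(S), len(S))
         for n in range(1, d + 1)
         for S in combinations(range(d), n)}


def J(n, x):
    """n-th chaos component: sum over index sets of size n."""
    return sum(c * prod(x[i] for i in S)
               for S, c in coeff.items() if len(S) == n)


def drift(n, x):
    """E[J_n(X') - J_n(X) | X = x]: uniform index I, then X_I is
    replaced by an independent Rademacher variable."""
    total = Fraction(0)
    for i in range(d):
        for new in (-1, 1):
            y = list(x)
            y[i] = new
            total += Fraction(1, 2 * d) * (J(n, y) - J(n, x))
    return total


# Check drift = -(n/d) J_n on every point of {-1, 1}^d.
linear = all(drift(n, x) == -Fraction(n, d) * J(n, x)
             for n in range(1, d + 1)
             for x in product((-1, 1), repeat=d))
print(linear)
```

The check only uses that the replacement variable has mean zero, so the same
enumeration would go through for any centred two-point distribution.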

4. Complete non-degenerate U-statistics

Using the exchangeable pairs coupling, Rinott and Rotar [10] proved a
univariate normal approximation theorem for non-degenerate weighted
U-statistics with symmetric weight function under fairly mild conditions on
the weights. Using the typical coupling, where a random variable $X_i$ is
chosen uniformly and replaced by an independent copy, they show that (1.1)
is satisfied for the one-dimensional case with a non-trivial remainder
term, consisting of Hoeffding projections of smaller order. It should not be
difficult (but nevertheless cumbersome) to generalise their result to the
multivariate case, where $d$ different U-statistics are regarded based on
the same sample of independent random variables, such that (1.1) is
satisfied with $\Lambda = \mathrm{Id}$ and a non-trivial remainder term,
again of lower order; for multivariate approximations of several
U-statistics see also the book by Lee [4]. However, as we want to emphasize
the use of Theorem 2.1 for non-diagonal $\Lambda$, we take a different
approach.

Let $X_1, \dots, X_n$ be a sequence of i.i.d. random elements taking values
in a measure space $\mathcal{X}$. Let $\psi$ be a measurable and symmetric
function from $\mathcal{X}^d$ to $\mathbb{R}$, and, for each
$k = 1, \dots, d$, let

$$\psi_k(X_1, \dots, X_k) := \mathbb{E}\bigl(\psi(X_1, \dots, X_d) \mid
  X_1, \dots, X_k\bigr).$$

Assume without loss of generality that
$\mathbb{E}\psi(X_1, \dots, X_d) = 0$. For any subset
$\alpha \subset \{1, \dots, n\}$ of size $k$ write
$\psi_k(\alpha) := \psi_k(X_{i_1}, \dots, X_{i_k})$ where the $i_j$ are the
elements of $\alpha$. Define the statistics

$$U_k := \sum_{|\alpha| = k} \psi_k(\alpha),$$

where $\sum_{E(\alpha)}$ denotes summation over all subsets
$\alpha \subset \{1, \dots, n\}$ which satisfy the property $E$. Then $U_d$
coincides with the usual U-statistic with kernel $\psi$. Assume that $U_d$
is non-degenerate, that is, $\mathbb{P}[\psi_1(X_1) = 0] < 1$.
Put

$$W_k := n^{1/2} \binom{n}{k}^{-1} U_k, \qquad k = 1, \dots, d.$$

It is well known that $\operatorname{Var} W_k \asymp 1$ (e.g. [4]). Note
also that, as $n \to \infty$, $\Sigma := \mathbb{E}(W W^t)$ will converge to
a covariance matrix with all entries equal to
$\operatorname{Var} \psi_1(X_1)$ and which is thus of rank 1, as we assume
non-degeneracy and hence $U_1 = \sum_{i=1}^n \psi_1(X_i)$ will dominate the
behaviour of each $U_k$.

Using Stein's method and the approach of decomposable random variables,
Raič [7] proved rates of convergence for vectors of U-statistics, where
the coordinates are assumed to be uncorrelated (but nevertheless based upon
the same sample $X_1, \dots, X_n$). The next theorem can be seen as a
complement to Raič's results, as in our case from above, a normalization is
not appropriate.
Theorem 4.1. With the above notation, and if
$\rho := \mathbb{E}\psi(X_1, \dots, X_d)^4 < \infty$, we have for every
three times differentiable function $h$

$$\bigl|\mathbb{E} h(W) - \mathbb{E} h(\Sigma^{1/2} Z)\bigr|
  \le n^{-1/2} \bigl(4 \rho^{1/2} d^6 |h|_2 + \rho^{3/4} d^7 |h|_3\bigr).$$



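Proof. Let $X_1', \dots, X_n'$ be independent copies of $X_1, \dots, X_n$.
Define the random variables $\psi'_{j,k}(\alpha)$ analogously to
$\psi_k(\alpha)$ but based on the sequence
$X_1, \dots, X_{j-1}, X_j', X_{j+1}, \dots, X_n$. Define the coupling as in
[10], that is, pick uniformly an index $J$ from $\{1, \dots, n\}$ and
replace $X_J$ by $X_J'$, so that
$U_k' = \sum_{|\alpha| = k} \psi'_{J,k}(\alpha)$; it is easy to see that
$(U', U)$ is exchangeable. Note now that, if $j \notin \alpha$,
$\psi'_{j,k}(\alpha) = \psi_k(\alpha)$, and, with $X = (X_1, \dots, X_n)$,
that $\mathbb{E}^X \psi'_{j,k}(\alpha) = \psi_{k-1}(\alpha \setminus \{j\})$
if $j \in \alpha$. Thus, with the convention $U_0 = W_0 = 0$ (recall that
$\mathbb{E}\psi = 0$),

$$\begin{aligned}
\mathbb{E}^X (U_k' - U_k)
 &= \frac1n \sum_{j=1}^n \sum_{|\alpha| = k,\ \alpha \ni j}
    \bigl(\psi_{k-1}(\alpha \setminus \{j\}) - \psi_k(\alpha)\bigr) \\
 &= -\frac{k}{n} U_k + \frac1n \sum_{j=1}^n
    \sum_{|\alpha| = k,\ \alpha \ni j} \psi_{k-1}(\alpha \setminus \{j\}) \\
 &= -\frac{k}{n} U_k + \frac{n-k+1}{n}
    \sum_{|\beta| = k-1} \psi_{k-1}(\beta) \\
 &= -\frac{k}{n} U_k + \frac{n-k+1}{n} U_{k-1},
\end{aligned} \tag{4.1}$$

where the second last equality follows from the observation that

$$\sum_{|\alpha| = k,\ \alpha \ni j} \psi_{k-1}(\alpha \setminus \{j\})
  = \sum_{|\beta| = k-1,\ \beta \not\ni j} \psi_{k-1}(\beta),$$

and thus, in the corresponding double sum of (4.1), every set $\beta$ of
size $k - 1$ appears exactly $n - (k - 1)$ times. Thus

$$\mathbb{E}^X (W_k' - W_k) = -\frac{k}{n} (W_k - W_{k-1}).$$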
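
Identity (4.1) can be confirmed exactly on a toy example. The sketch below
is an illustration under stated assumptions: the kernel, the Bernoulli
parameter and the sample size are made up for the check ($d = 2$,
$X_i \sim$ Bernoulli(1/3), $n = 4$), and all names are hypothetical. It
enumerates every sample and compares the conditional drift of $U_k$ with
$-\frac{k}{n} U_k + \frac{n-k+1}{n} U_{k-1}$, where $U_0 = 0$.

```python
from fractions import Fraction
from itertools import combinations, product

p = Fraction(1, 3)   # assumed success probability of each X_i
n = 4                # assumed sample size


def psi2(x, y):
    """Symmetric kernel of order d = 2, centred so that E psi2 = 0."""
    return x + y + x * y - (2 * p + p * p)


def psi1(x):
    """psi_1(x) = E psi2(x, X_2), computed exactly."""
    return (1 - p) * psi2(x, 0) + p * psi2(x, 1)


def U(k, xs):
    if k == 1:
        return sum(psi1(x) for x in xs)
    return sum(psi2(xs[i], xs[j])
               for i, j in combinations(range(len(xs)), 2))


def drift(k, xs):
    """E[U_k' - U_k | X = xs] under uniform single-coordinate resampling."""
    total = Fraction(0)
    for j in range(n):
        for new, weight in ((0, 1 - p), (1, p)):
            ys = list(xs)
            ys[j] = new
            total += Fraction(1, n) * weight * (U(k, ys) - U(k, xs))
    return total


holds = all(
    drift(1, xs) == -Fraction(1, n) * U(1, xs)
    and drift(2, xs) == -Fraction(2, n) * U(2, xs)
    + Fraction(n - 1, n) * U(1, xs)
    for xs in product((0, 1), repeat=n))
print(holds)
```

Multiplying $U_k$ by a factor proportional to $\binom{n}{k}^{-1}$ turns the
coefficient $\frac{n-k+1}{n}$ of $U_{k-1}$ into $\frac{k}{n}$, which is the
rescaled form of the identity.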



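Hence, (1.1) is satisfied for $R = 0$ and

$$\Lambda = \frac1n \begin{pmatrix}
  1 & & & & \\
  -2 & 2 & & & \\
  & -3 & 3 & & \\
  & & \ddots & \ddots & \\
  & & & -d & d
\end{pmatrix},$$

with lower triangular $\Lambda^{-1}$ such that, if $l \le k$,

$$(\Lambda^{-1})_{k,l} = n / l;$$

thus, for $l = 1, \dots, d$,

$$\lambda^{(l)} \le d n. \tag{4.2}$$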
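
The form of $\Lambda^{-1}$ and the bound (4.2) can be checked mechanically.
The following is a small sketch with hypothetical helper names and
arbitrarily chosen values $d = 5$, $n = 40$; it multiplies the bidiagonal
matrix by the claimed inverse in exact arithmetic and computes the column
sums $\lambda^{(l)}$.

```python
from fractions import Fraction


def lam_matrix(d, n):
    """Lambda: k/n on the diagonal, -k/n just below it (rows k = 1..d)."""
    m = [[Fraction(0)] * d for _ in range(d)]
    for k in range(1, d + 1):
        m[k - 1][k - 1] = Fraction(k, n)
        if k > 1:
            m[k - 1][k - 2] = Fraction(-k, n)
    return m


def claimed_inverse(d, n):
    """Candidate inverse with entries n / l for l <= k and 0 otherwise."""
    return [[Fraction(n, l) if l <= k else Fraction(0)
             for l in range(1, d + 1)] for k in range(1, d + 1)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


d, n = 5, 40
product_matrix = matmul(lam_matrix(d, n), claimed_inverse(d, n))
identity = [[Fraction(int(i == j)) for j in range(d)] for i in range(d)]

# lambda^{(l)} is the sum of absolute values in column l of the inverse.
inverse = claimed_inverse(d, n)
lam_weights = [sum(abs(row[l]) for row in inverse) for l in range(d)]
print(product_matrix == identity, lam_weights)
```

Column $l$ of the inverse has $d - l + 1$ non-zero entries, each equal to
$n/l$, so $\lambda^{(l)} = (d - l + 1) n / l$, and the bound $dn$ is attained
for $l = 1$.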
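Define now $\eta_{j,k}(\alpha) := \psi'_{j,k}(\alpha) - \psi_k(\alpha)$.
Then we have for every $k$, $l$

$$\mathbb{E}^{X,X'}\{(U_k' - U_k)(U_l' - U_l)\}
  = \frac1n \sum_{j=1}^n \sum_{|\alpha| = k,\ |\beta| = l,\ \alpha \cap \beta \ni j}
    \eta_{j,k}(\alpha)\, \eta_{j,l}(\beta) \tag{4.3}$$

and

$$\mathbb{E}\bigl[\mathbb{E}^{X,X'}\{(U_k' - U_k)(U_l' - U_l)\}\bigr]^2
  = \frac{1}{n^2} \sum_{i,j=1}^n
    \sum_{|\alpha| = k,\ |\beta| = l,\ \alpha \cap \beta \ni i}\
    \sum_{|\gamma| = k,\ |\delta| = l,\ \gamma \cap \delta \ni j}
    \mathbb{E}\{\eta_{i,k}(\alpha)\, \eta_{i,l}(\beta)\,
                \eta_{j,k}(\gamma)\, \eta_{j,l}(\delta)\}. \tag{4.4}$$

Note now that, if the sets $\alpha \cup \beta$ and $\gamma \cup \delta$ are
disjoint (which can only happen if $i \ne j$),

$$\mathbb{E}\{\eta_{i,k}(\alpha)\, \eta_{i,l}(\beta)\,
             \eta_{j,k}(\gamma)\, \eta_{j,l}(\delta)\}
  = \mathbb{E}\{\eta_{i,k}(\alpha)\, \eta_{i,l}(\beta)\}\,
    \mathbb{E}\{\eta_{j,k}(\gamma)\, \eta_{j,l}(\delta)\} \tag{4.5}$$

due to independence. The variance of (4.3), that is (4.4) minus the square
of the expectation of (4.3), contains only summands where
$\alpha \cup \beta$ and $\gamma \cup \delta$ are not disjoint. Recall now
that $\rho = \mathbb{E}\psi(X_1, \dots, X_d)^4$. Bounding all the
non-vanishing terms simply by $32\rho$, it only remains to count the number
of non-vanishing terms. Thus,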

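$$\begin{aligned}
&\operatorname{Var} \mathbb{E}^{X,X'}\{(U_k' - U_k)(U_l' - U_l)\} \\
&\quad\le \frac{32\rho}{n^2} \sum_{i,j=1}^n
   \sum_{|\alpha| = k,\ |\beta| = l,\ \alpha \cap \beta \ni i}\
   \sum_{\substack{|\gamma| = k,\ |\delta| = l,\ \gamma \cap \delta \ni j, \\
                   (\gamma \cup \delta) \cap (\alpha \cup \beta) \ne \emptyset}} 1 \\
&\quad= \frac{32\rho}{n^2} \sum_{i=1}^n
   \sum_{|\alpha| = k,\ |\beta| = l,\ \alpha \cap \beta \ni i}
   \Bigl( \sum_{j \in \alpha \cup \beta}\
          \sum_{|\gamma| = k,\ |\delta| = l,\ \gamma \cap \delta \ni j} 1
        + \sum_{j \notin \alpha \cup \beta}\
          \sum_{\substack{|\gamma| = k,\ |\delta| = l,\ \gamma \cap \delta \ni j, \\
                          (\gamma \cup \delta) \cap (\alpha \cup \beta) \ne \emptyset}} 1
   \Bigr),
\end{aligned}$$

where the equality is just a split of the sum over $j$ into the cases
whether or not $j \in \alpha \cup \beta$. In the former case we
automatically have
$(\alpha \cup \beta) \cap (\gamma \cup \delta) \ne \emptyset$. It is now not
difficult to see that the first part (the sum over
$j \in \alpha \cup \beta$) is bounded by

$$\frac{32\rho (k + l - 1)}{n} \binom{n-1}{k-1}^2 \binom{n-1}{l-1}^2.$$

Noting that, for fixed $j$, $k$, $l$, $\alpha$ and $\beta$,

$$\begin{aligned}
&\{|\gamma| = k, |\delta| = l : \gamma \cap \delta \ni j,\
   (\gamma \cup \delta) \cap (\alpha \cup \beta) \ne \emptyset\} \\
&\quad= \{|\gamma| = k, |\delta| = l : \gamma \cap \delta \ni j\} \\
&\qquad \setminus \{|\gamma| = k, |\delta| = l : \gamma \cap \delta \ni j,\
   (\gamma \cup \delta) \cap (\alpha \cup \beta) = \emptyset\},
\end{aligned}$$

we further have that the second part (the sum over
$j \notin \alpha \cup \beta$) is bounded by

$$\frac{32\rho (n - 1)}{n} \binom{n-1}{k-1} \binom{n-1}{l-1}
  \Bigl\{ \binom{n-1}{k-1} \binom{n-1}{l-1}
        - \binom{n-k-l+1}{k-1} \binom{n-k-l+1}{l-1} \Bigr\},$$

where we also used that the number of admissible sets is smallest when
$\alpha \cup \beta$ is as large as possible. The following statements are
straightforward to prove:

$$\binom{n-1}{k-1} \binom{n}{k}^{-1} = \frac{k}{n}, \tag{4.6}$$

$$\binom{n-k-l+1}{k-1} \binom{n}{k}^{-1}
  \ge \frac{k}{n} \Bigl(\frac{n - 2k - l + 3}{n}\Bigr)^{k-1}
  \ge \frac{k}{n} \Bigl(1 - \frac{k(2k + l - 3)}{n}\Bigr). \tag{4.7}$$

Thus, from (4.6), after multiplication by
$n^2 \binom{n}{k}^{-2} \binom{n}{l}^{-2}$ the first part is bounded by

$$\frac{32\rho (k + l - 1) k^2 l^2}{n^3} \le \frac{64 \rho d^5}{n^3}.$$

From (4.6) and (4.7), after multiplication by
$n^2 \binom{n}{k}^{-2} \binom{n}{l}^{-2}$ the second part is bounded by

$$\frac{32 \rho k^2 l^2 \bigl(k(2k + l - 3) + l(k + 2l - 3)\bigr)}{n^3}
  \le \frac{192 \rho d^6}{n^3}. \tag{4.8}$$

Thus, for all $k$ and $l$,

$$\operatorname{Var} \mathbb{E}^W (W_k' - W_k)(W_l' - W_l)
  \le \operatorname{Var} \mathbb{E}^{X,X'} (W_k' - W_k)(W_l' - W_l)
  \le \frac{256 \rho d^6}{n^3}. \tag{4.9}$$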

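Notice further that for any $m = 1, \dots, d$,

$$\mathbb{E}(U_m' - U_m)^4
  = \frac1n \sum_{j=1}^n
    \sum_{\substack{|\alpha| = |\beta| = |\gamma| = |\delta| = m, \\
                    \alpha \cap \beta \cap \gamma \cap \delta \ni j}}
    \mathbb{E}\{\eta_{j,m}(\alpha)\, \eta_{j,m}(\beta)\,
                \eta_{j,m}(\gamma)\, \eta_{j,m}(\delta)\}
  \le 16 \rho \binom{n-1}{m-1}^4;$$

using (5.3), hence, along with (4.6),

$$\begin{aligned}
\mathbb{E}\bigl|(W_j' - W_j)(W_k' - W_k)(W_l' - W_l)\bigr|
 &\le \max_{m = j,k,l} \mathbb{E}|W_m' - W_m|^3 \\
 &\le 8 \rho^{3/4} n^{3/2} \max_{m = j,k,l}
      \Bigl\{\binom{n}{m}^{-1} \binom{n-1}{m-1}\Bigr\}^3 \\
 &\le 8 \rho^{3/4} d^3 n^{-3/2}.
\end{aligned} \tag{4.10}$$

Applying Theorem 2.1 with the estimates (4.2), (4.9) and (4.10) proves the
claim. □

Remark 4.2. Using the operator norm as used by Meckes [5] we would be
able to achieve a bound of $n \log(d + 1)$ instead of (4.2), but using
bounds for the total derivatives of the test functions,
$\sup_{x \in \mathbb{R}^d} \|D^k h(x)\|_{op}$, instead of bounds for
$|h|_k$.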
5. Edge and triangle counts in Bernoulli random graphs

Typical summaries for random graphs are the degree distribution and 
the number of triangles, as a proxy for the clustering coefficient in a random 
graph, which is the expected ratio of the number of triangles over the number 
of 2-stars a randomly chosen vertex is involved with. Conditional uniform 
graph tests are based on fixing the degree distribution and randomising over 
the edges, conditional on keeping the degree distribution fixed. Our next 
example shows that even when fixing only the number of edges, not even 
the degree distribution, under a normal asymptotic regime the number of 
triangles, and the number of 2-stars, is already asymptotically determined. 
Let $G(n, p)$ denote a Bernoulli random graph on $n$ vertices, with edge
probabilities $p$; we assume that $n > 4$ and that $0 < p < 1$. Let
$I_{i,j} = I_{j,i}$ be the Bernoulli($p$)-indicator that edge $(i, j)$ is
present in the graph; these indicators are independent. Our interest is in
the joint distribution of the total number of edges, described by

$$T := \frac12 \sum_{i \ne j} I_{i,j} = \sum_{i<j} I_{i,j},$$

and the number of triangles,

$$U := \frac16 \sum_{i,j,k \text{ distinct}} I_{i,j} I_{j,k} I_{i,k}
     = \sum_{i<j<k} I_{i,j} I_{j,k} I_{i,k}.$$

Here and in what follows, "$i, j, k$ distinct" is short for
"$(i, j, k) : i \ne j \ne k \ne i$"; later we shall also use
"$i, j, k, \ell$ distinct", which is the analogous abbreviation for four
indices.

In view of the embedding method, we also include the auxiliary statistic
related to the number of 2-stars,

$$V := \frac12 \sum_{i,j,k \text{ distinct}} I_{i,j} I_{j,k}
     = \sum_{i<j<k} (I_{i,j} I_{j,k} + I_{i,j} I_{i,k} + I_{j,k} I_{i,k}).$$

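We note that

$$\mathbb{E} T = \binom{n}{2} p, \qquad
  \mathbb{E} V = 3 \binom{n}{3} p^2, \qquad\text{and}\qquad
  \mathbb{E} U = \binom{n}{3} p^3.$$

With some calculation, as detailed in Section 5.1, we find that the
variances are not all of the same order. Hence, we re-scale our variables
(cf. [3]), putting

$$T_1 = \frac{n-2}{n^2} T, \qquad V_1 = \frac{1}{n^2} V,
  \qquad\text{and}\qquad U_1 = \frac{1}{n^2} U.$$

For these re-scaled variables the covariance matrix $\Sigma_1$ for
$W_1 = (T_1 - \mathbb{E} T_1, V_1 - \mathbb{E} V_1, U_1 - \mathbb{E} U_1)$
equals

$$\Sigma_1 = \frac{3 (n-2) \binom{n}{3}}{n^4}\, p(1-p) \times
\begin{pmatrix}
 1 & 2p & p^2 \\
 2p & 4p^2 + \frac{p(1-p)}{n-2} & 2p^3 + \frac{p^2(1-p)}{n-2} \\
 p^2 & 2p^3 + \frac{p^2(1-p)}{n-2}
     & p^4 + \frac{p^3(1-p)}{n-2} + \frac{p^2(1-p)^2}{3(n-2)}
\end{pmatrix}. \tag{5.1}$$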
Remark 5.1. With $n \to \infty$ we obtain as approximating covariance matrix

$$\Sigma_0 = \frac12 p(1-p) \times
\begin{pmatrix}
 1 & 2p & p^2 \\
 2p & 4p^2 & 2p^3 \\
 p^2 & 2p^3 & p^4
\end{pmatrix}. \tag{5.2}$$

As also observed in [3], this matrix has rank 1. It is not difficult to see
that the maximal diagonal entry of the inverse $\Sigma_1^{-1}$ tends to
$\infty$ as $n \to \infty$, so that a uniform bound on the square root of
$\Sigma_1^{-1}$, as suggested in Remark 2.4, will not be useful.

Janson and Nowicki [3] derived a normal limit for $W_1$, but no bounds on
the approximation are given. Using Theorem 2.1 we obtain explicit bounds,
as follows.

Proposition 5.2. Let
$W_1 = (T_1 - \mathbb{E} T_1, V_1 - \mathbb{E} V_1, U_1 - \mathbb{E} U_1)$
be the centralized count vector of the number of edges, two-stars and
triangles in a Bernoulli($p$)-random graph. Let $\Sigma_1$ be given as in
(5.1). Then, for every three times differentiable function $h$,

$$\bigl|\mathbb{E} h(W_1) - \mathbb{E} h(\Sigma_1^{1/2} Z)\bigr|
  \le \frac{|h|_2}{n} \Bigl(\frac{35}{4} + 9 n^{-1}\Bigr)
    + \frac{8 |h|_3}{3n} \bigl(1 + n^{-1} + n^{-2}\bigr).$$

While we do not claim that the constants in the bound are sharp, as we
have $\binom{n}{2}$ random edges in the model, the order $O(n^{-1})$ of the
bound is as expected. While for simplicity our other bounds are given as
expressions which are uniform in $p$, bounds dependent on $p$ are derived on
the way. In this example, we were not able to obtain any improvement on the
bounds using the operator bounds [5].

Proof. The proof consists of two stages. Firstly we construct an
exchangeable pair; it will turn out that $R = 0$ in (1.1) and hence $C$ in
Theorem 2.1 will vanish. In the second stage we bound the terms $A$ and $B$
in Theorem 2.1.

Construction of an exchangeable pair

Our vector of interest is now
$W = (T - \mathbb{E} T, V - \mathbb{E} V, U - \mathbb{E} U)$, re-scaled to
$W_1 = (T_1 - \mathbb{E} T_1, V_1 - \mathbb{E} V_1, U_1 - \mathbb{E} U_1)$.
We build an exchangeable pair by choosing a potential edge $(i, j)$
uniformly at random, and replacing $I_{i,j}$ by an independent copy
$I'_{i,j}$. More formally, pick $(I, J)$ according to

$$\mathbb{P}[I = i, J = j] = \binom{n}{2}^{-1}, \qquad 1 \le i < j \le n.$$

If $I = i$, $J = j$ we replace $I_{i,j} = I_{j,i}$ by an independent copy
$I'_{i,j} = I'_{j,i}$ and put

$$\begin{aligned}
T' &= T - (I_{I,J} - I'_{I,J}), \\
V' &= V - (I_{I,J} - I'_{I,J}) \sum_{k : k \ne I,J} (I_{J,k} + I_{I,k}), \\
U' &= U - (I_{I,J} - I'_{I,J}) \sum_{k : k \ne I,J} I_{J,k} I_{I,k}.
\end{aligned}$$

Put $W' = (T' - \mathbb{E} T, V' - \mathbb{E} V, U' - \mathbb{E} U)$. Then
$(W, W')$ forms an exchangeable pair. We re-scale $W'$ as for $W$ to obtain
$T_1'$, $V_1'$ and $U_1'$, so that $(W_1, W_1')$ is also exchangeable.

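Calculation of $\Lambda$

For the conditional expectations $\mathbb{E}^W(W' - W)$, firstly we have

$$\mathbb{E}^W (T_1' - T_1)
  = -\frac{n-2}{n^2} \binom{n}{2}^{-1} \sum_{i<j} \mathbb{E}^W (I_{i,j} - p)
  = -\binom{n}{2}^{-1} (T_1 - \mathbb{E} T_1).$$

Furthermore

$$\mathbb{E}^W (V_1' - V_1)
  = -2 \binom{n}{2}^{-1} (V_1 - \mathbb{E} V_1)
    + 2p \binom{n}{2}^{-1} (T_1 - \mathbb{E} T_1),$$

where the last equality follows from $\mathbb{E}(V_1' - V_1) = 0$.
Similarly,

$$\mathbb{E}^W (U_1' - U_1)
  = -3 \binom{n}{2}^{-1} (U_1 - \mathbb{E} U_1)
    + p \binom{n}{2}^{-1} (V_1 - \mathbb{E} V_1).$$

Using our re-scaling, (1.1) is satisfied with $R = 0$ and $\Lambda$ given by

$$\Lambda = \binom{n}{2}^{-1}
\begin{pmatrix} 1 & 0 & 0 \\ -2p & 2 & 0 \\ 0 & -p & 3 \end{pmatrix}.$$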
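
For a very small graph the linearity condition with this $\Lambda$ can be
confirmed by brute force. The sketch below is illustrative only: it fixes
$n = 4$ and $p = 1/3$ (arbitrary choices for the check), uses hypothetical
helper names, enumerates all $2^6$ graphs, and verifies that the conditional
drift of the re-scaled vector given the whole graph equals $-\Lambda W_1$
exactly. Conditioning on the graph is finer than conditioning on $W_1$, so
this implies the displayed identities with $R = 0$.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

n = 4                     # assumed small graph size for enumeration
p = Fraction(1, 3)        # assumed edge probability
edges = list(combinations(range(n), 2))
triples = list(combinations(range(n), 3))
N = len(edges)            # number of potential edges, binom(n, 2)


def rescaled_counts(g):
    """(T_1, V_1, U_1) for a graph g given as a dict edge -> 0/1."""
    t = sum(g.values())
    v = sum(g[i, j] * g[j, k] + g[i, j] * g[i, k] + g[j, k] * g[i, k]
            for i, j, k in triples)
    u = sum(g[i, j] * g[j, k] * g[i, k] for i, j, k in triples)
    return (Fraction(n - 2, n * n) * t, Fraction(v, n * n),
            Fraction(u, n * n))


mean = (Fraction(n - 2, n * n) * comb(n, 2) * p,
        3 * comb(n, 3) * p ** 2 / (n * n),
        comb(n, 3) * p ** 3 / (n * n))

lam = [[Fraction(1, N), Fraction(0), Fraction(0)],
       [-2 * p / N, Fraction(2, N), Fraction(0)],
       [Fraction(0), -p / N, Fraction(3, N)]]


def drift(g):
    """E[W_1' - W_1 | graph g]: pick an edge uniformly, redraw it."""
    w = rescaled_counts(g)
    out = [Fraction(0)] * 3
    for e in edges:
        for new, weight in ((0, 1 - p), (1, p)):
            h = dict(g)
            h[e] = new
            w2 = rescaled_counts(h)
            for c in range(3):
                out[c] += weight / N * (w2[c] - w[c])
    return out


def minus_lambda_w(g):
    w = [a - m for a, m in zip(rescaled_counts(g), mean)]
    return [-sum(lam[r][c] * w[c] for c in range(3)) for r in range(3)]


graphs = [dict(zip(edges, bits)) for bits in product((0, 1), repeat=N)]
linear = all(drift(g) == minus_lambda_w(g) for g in graphs)
print(linear)
```

Dropping the 2-star coordinate from the vector would leave a term in the
triangle drift that is not a linear function of the edge and triangle counts.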

Bounding $A$

The inverse matrix $\Lambda^{-1}$ is easy to calculate; for
$\lambda^{(i)} = \sum_{m=1}^3 |(\Lambda^{-1})_{m,i}|$, for simplicity we
shall apply the uniform bound

$$|\lambda^{(i)}| \le \tfrac32 n^2, \qquad i = 1, 2, 3.$$
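
As a sanity check on the size of $\lambda^{(i)}$, the inverse can be written
down and verified exactly. This is a sketch with hypothetical function
names; the closed form of the inverse used here was obtained by forward
substitution and is confirmed in the code by multiplying it with $\Lambda$.

```python
from fractions import Fraction
from math import comb


def graph_lambda(n, p):
    big_n = comb(n, 2)
    return [[Fraction(1, big_n), Fraction(0), Fraction(0)],
            [-2 * p / big_n, Fraction(2, big_n), Fraction(0)],
            [Fraction(0), -p / big_n, Fraction(3, big_n)]]


def graph_lambda_inverse(n, p):
    """Candidate inverse of graph_lambda(n, p), checked in lambda_weights."""
    big_n = comb(n, 2)
    return [[Fraction(big_n), Fraction(0), Fraction(0)],
            [big_n * p, Fraction(big_n, 2), Fraction(0)],
            [big_n * p * p / 3, big_n * p / 6, Fraction(big_n, 3)]]


def lambda_weights(n, p):
    """Column sums of |Lambda^{-1}|, after verifying the inverse."""
    lam, inv = graph_lambda(n, p), graph_lambda_inverse(n, p)
    prod_ = [[sum(lam[i][k] * inv[k][j] for k in range(3))
              for j in range(3)] for i in range(3)]
    assert prod_ == [[int(i == j) for j in range(3)] for i in range(3)]
    return [sum(abs(inv[m][i]) for m in range(3)) for i in range(3)]


for n in (4, 10, 50):
    for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
        print(n, p, [float(w / n ** 2) for w in lambda_weights(n, p)])
```

The largest weight is $\lambda^{(1)} = \binom{n}{2}(1 + p + p^2/3)$, which
stays below $\frac76 n^2$ for $0 < p < 1$, so the uniform bound leaves some
room.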

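As the bounding of the conditional variances is somewhat laborious, most
of the work can be found in Section 5.2. The conditional variances
involving $T' - T$ can be calculated exactly. As $I_{i,j}^2 = I_{i,j}$,

$$\begin{aligned}
\mathbb{E}^W (T' - T)^2
 &= \binom{n}{2}^{-1} \sum_{i<j} \mathbb{E}^W (I'_{i,j} - I_{i,j})^2 \\
 &= \binom{n}{2}^{-1} \sum_{i<j}
    \bigl\{p - p\, \mathbb{E}^W I_{i,j} + (1 - p)\, \mathbb{E}^W I_{i,j}\bigr\}
  = p + (1 - 2p) \binom{n}{2}^{-1} T,
\end{aligned}$$

so that with $\operatorname{Var} T$ given in (5.5),
$\operatorname{Var}(\mathbb{E}^W (T' - T)^2)
 = \binom{n}{2}^{-1} (1 - 2p)^2 p(1 - p)$ and

$$\operatorname{Var}(\mathbb{E}^W (T_1' - T_1)^2)
  = \frac{(n-2)^4}{n^8 \binom{n}{2}} (1 - 2p)^2 p(1 - p) < n^{-6},$$

where we used that $p(1 - p) \le 1/4$ for all $p$. Thus

$$\sqrt{\operatorname{Var}(\mathbb{E}^W (T_1' - T_1)^2)} < n^{-3}.$$

Similarly we obtain

$$\sqrt{\operatorname{Var} \mathbb{E}^W (T_1' - T_1)(V_1' - V_1)} < 2 n^{-3}$$

and

$$\sqrt{\operatorname{Var} \mathbb{E}^W (T_1' - T_1)(U_1' - U_1)} < n^{-3}.$$

The conditional variances involving $V$ and $U$ only are more involved; the
calculations are available in Section 5.2. We obtain after some
calculations that
$\operatorname{Var}(\mathbb{E}^W (V' - V)^2) \le 35 n^2$, and hence

$$\sqrt{\operatorname{Var}(\mathbb{E}^W (V_1' - V_1)^2)} < 6 n^{-3}.$$

For $\mathbb{E}^W (V_1' - V_1)(U_1' - U_1)$, analogous calculations lead to

$$\sqrt{\operatorname{Var}\bigl(\mathbb{E}^W (V_1' - V_1)(U_1' - U_1)\bigr)}
  < n^{-3} + 11 n^{-4}.$$

Finally, again using our variance inequalities,

$$\sqrt{\operatorname{Var}\bigl(\mathbb{E}^W (U_1' - U_1)^2\bigr)}
  < 5 n^{-3} + 2 n^{-4}.$$

Collecting these bounds we obtain for $A$ in Theorem 2.1 that

$$A < 35 n^{-1} + 36 n^{-2}.$$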

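Bounding $B$

We use the generalized Hölder inequality

$$\mathbb{E} \prod_{i=1}^3 |X_i|
  \le \prod_{i=1}^3 \bigl(\mathbb{E}|X_i|^3\bigr)^{1/3}
  \le \max_{i=1,2,3} \mathbb{E}|X_i|^3. \tag{5.3}$$

Again the complete calculations are found in Section 5.3. To illustrate the
calculation,

$$\mathbb{E}|T' - T|^3
  = \binom{n}{2}^{-1} \sum_{i<j} \mathbb{E}|I'_{i,j} - I_{i,j}|^3
  = 2p(1 - p) \le \tfrac12,$$

so that

$$\mathbb{E}|T_1' - T_1|^3 = \frac{(n-2)^3}{n^6}\, 2p(1 - p) < \tfrac12 n^{-3}.$$

Similar calculations yield the corresponding bounds for
$\mathbb{E}|V_1' - V_1|^3$ and $\mathbb{E}|U_1' - U_1|^3$. Thus for $B$ in
Theorem 2.1 we have

$$B < 32 \bigl(n^{-1} + n^{-2} + n^{-3}\bigr).$$

Collecting the bounds gives the result. □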






GESINE REINERT AND ADRIAN ROLLIN 



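Remark 5.3. Had we not introduced $V$, conditioning would yield

$$\begin{aligned}
\mathbb{E}^W (U' - U)
 &= -\frac{2}{n(n-1)} \sum_{i<j} \sum_{k : k \ne i,j}
    \mathbb{E}^W (I_{i,j} - p) I_{j,k} I_{i,k} \\
 &= -\frac{6}{n(n-1)} U + \frac{2p}{n(n-1)}
    \sum_{i<j} \sum_{k : k \ne i,j} \mathbb{E}^W I_{j,k} I_{i,k}.
\end{aligned}$$

The expression
$\sum_{i<j} \sum_{k : k \ne i,j} \mathbb{E}^W I_{j,k} I_{i,k}$ would result
in a non-linear remainder term $R$ in Equation (1.1). The introduction of
$V$ not only avoids this remainder term, indeed $R = 0$ in (1.1), but also
yields a more detailed result. This observation that the 2-stars form a
useful auxiliary statistic can also be found in [3]; there it is related to
Hoeffding-type projections.

Using Proposition 2.2 we also obtain a normal approximation with the
covariance matrix $\Sigma_0$ given in (5.2).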

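Corollary 5.4. Under the assumptions of Proposition 5.2, for every three
times differentiable function $h$,

$$\bigl|\mathbb{E} h(W_1) - \mathbb{E} h(\Sigma_0^{1/2} Z)\bigr|
  \le \frac{|h|_2}{2n} \bigl(44 + 21 n^{-1} + 32 n^{-2} + 4 n^{-3}\bigr)
    + \frac{8 |h|_3}{3n} \bigl(1 + n^{-1} + n^{-2}\bigr).$$

Proof. We employ Proposition 5.2 and Proposition 2.2 with the triangle
inequality. A straightforward calculation bounds the difference between the
prefactors and entries of (5.1) and (5.2), and so, writing
$\Sigma_1 = (\sigma_{i,j})$ and $\Sigma_0 = (\sigma^0_{i,j})$,

$$\sum_{i,j=1}^3 \bigl|\sigma_{i,j} - \sigma^0_{i,j}\bigr|
  < 26 n^{-1} + 3 n^{-2} + 32 n^{-3} + 4 n^{-4}.$$

Here we used the crude bound that $(n - 2)^{-1} \le 2 n^{-1}$. The corollary
follows. □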

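5.1. Calculation of the covariance matrix. To calculate the covariance
matrix $\Sigma$, we put

$$\tilde I_{i,j} := I_{i,j} - p$$

as the centralized edge indicator, and similarly we centralize

$$\tilde T = \sum_{i<j} \tilde I_{i,j}, \qquad
  \tilde V = \frac12 \sum_{i,j,k \text{ distinct}}
             \tilde I_{i,j} \tilde I_{j,k}, \qquad
  \tilde U = \sum_{i<j<k} \tilde I_{i,j} \tilde I_{j,k} \tilde I_{i,k}.$$

Then, by independence, all these quantities have mean zero. For the
variances, the expectation of the product of centralized indicators
vanishes unless all the centralized indicators involved are raised to an
even power. Hence

$$\operatorname{Var} \tilde T = \binom{n}{2} p(1-p), \quad
  \operatorname{Var} \tilde V = 3 \binom{n}{3} p^2 (1-p)^2, \quad
  \operatorname{Var} \tilde U = \binom{n}{3} p^3 (1-p)^3. \tag{5.4}$$

Moreover, for the same reason, all covariances between the centralised
variables vanish. Expressing $T$, $V$ and $U$ in terms of the centralised
variables, we have $\tilde T = T - \mathbb{E} T$ so that

$$T = \tilde T + \mathbb{E} T = \tilde T + \binom{n}{2} p$$

and

$$\operatorname{Var} T = \binom{n}{2} p(1-p)
  = 3 \binom{n}{3} (n-2)^{-1} p(1-p). \tag{5.5}$$

Next, $\tilde V = V - 2p(n-2) T + 3 p^2 \binom{n}{3}$, so that

$$V = \tilde V + 2(n-2) p \tilde T + 3 \binom{n}{3} p^2.$$

As $\tilde V$ and $\tilde T$ are uncorrelated, this gives that

$$\operatorname{Var} V
  = 3 \binom{n}{3} p^2 (1-p) \{1 - p + 4(n-2) p\}. \tag{5.6}$$

For $U$, we have
$\tilde U = U - pV + p^2 (n-2) T - p^3 \binom{n}{3}$. Using the above
expressions for $T$ and $V$ we obtain

$$U = \tilde U + p \tilde V + p^2 (n-2) \tilde T + p^3 \binom{n}{3}.$$

This gives for the variance

$$\operatorname{Var} U = \binom{n}{3} p^3 (1-p)
  \bigl\{(1-p)^2 + 3p(1-p) + 3(n-2) p^2\bigr\}. \tag{5.7}$$

We can now also calculate the covariances. Again we use that the
centralized variables are uncorrelated to obtain

$$\operatorname{Cov}(T, V)
  = \operatorname{Cov}\bigl(\tilde T, \tilde V + 2(n-2) p \tilde T\bigr)
  = 2(n-2) p \operatorname{Var}(\tilde T)
  = 6 \binom{n}{3} p^2 (1-p).$$

Similarly, we calculate that
$\operatorname{Cov}(T, U) = 3 \binom{n}{3} p^3 (1-p)$, and
$\operatorname{Cov}(V, U) = 3 \binom{n}{3} p^3 (1-p) \{1 - p + 2(n-2) p\}$.
Re-scaling gives the covariance matrix (5.1).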
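
The variance and covariance formulas of this section can be cross-checked
against exact enumeration on a small graph. The sketch below assumes
$n = 4$ and $p = 1/3$ (values picked only for the check), with hypothetical
helper names; it computes the exact covariance matrix of the unscaled
$(T, V, U)$ over all $2^6$ graphs and compares it with the closed-form
expressions.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

n, p = 4, Fraction(1, 3)   # assumed values for the enumeration
q = 1 - p
edges = list(combinations(range(n), 2))
triples = list(combinations(range(n), 3))


def counts(bits):
    """Edge, 2-star and triangle counts of the graph encoded by bits."""
    g = dict(zip(edges, bits))
    t = sum(bits)
    v = sum(g[i, j] * g[j, k] + g[i, j] * g[i, k] + g[j, k] * g[i, k]
            for i, j, k in triples)
    u = sum(g[i, j] * g[j, k] * g[i, k] for i, j, k in triples)
    return (t, v, u)


def exact_covariance():
    m1 = [Fraction(0)] * 3
    m2 = [[Fraction(0)] * 3 for _ in range(3)]
    for bits in product((0, 1), repeat=len(edges)):
        weight = p ** sum(bits) * q ** (len(edges) - sum(bits))
        c = counts(bits)
        for a in range(3):
            m1[a] += weight * c[a]
            for b in range(3):
                m2[a][b] += weight * c[a] * c[b]
    return [[m2[a][b] - m1[a] * m1[b] for b in range(3)] for a in range(3)]


c3 = comb(n, 3)
cov_tv = 6 * c3 * p ** 2 * q
cov_tu = 3 * c3 * p ** 3 * q
cov_vu = 3 * c3 * p ** 3 * q * (q + 2 * (n - 2) * p)
claimed = [
    [comb(n, 2) * p * q, cov_tv, cov_tu],
    [cov_tv, 3 * c3 * p ** 2 * q * (q + 4 * (n - 2) * p), cov_vu],
    [cov_tu, cov_vu,
     c3 * p ** 3 * q * (q ** 2 + 3 * p * q + 3 * (n - 2) * p ** 2)],
]
print(exact_covariance() == claimed)
```

Multiplying row and column 1 by $(n-2)/n^2$ and rows and columns 2 and 3 by
$1/n^2$ then gives the entries of the re-scaled matrix.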

5.2. Calculation of the conditional variances. For the conditional
variances in the random graph example, the calculations are somewhat
involved. We repeat the first calculation in more detail before moving on
to further bounds. We have

$$\begin{aligned}
\mathbb{E}^W (T' - T)(V' - V)
 &= \binom{n}{2}^{-1} \sum_{i<j} \mathbb{E}^W (I'_{i,j} - I_{i,j})^2
    \sum_{k : k \ne i,j} (I_{i,k} + I_{j,k}) \\
 &= \binom{n}{2}^{-1} \bigl(2(n-2) p T + 2(1 - 2p) V\bigr),
\end{aligned}$$

so that

$$\operatorname{Var} \mathbb{E}^W (T' - T)(V' - V)
  = \frac{12 \binom{n}{3}}{\binom{n}{2}^2}\, p(1-p)
    \bigl\{(n-2) p^2 (3 - 4p)^2 + (1 - 2p)^2 p(1-p)\bigr\} < 4,$$

where we used that p^{l — p) ^ ^ and that n > 4. Thus 

^VarE^(T'-T)(y'-y) < 2n~l 
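Similarly,

$$\mathbb{E}^W (T' - T)(U' - U)
  = \binom{n}{2}^{-1} \bigl(pV + 3(1 - 2p) U\bigr).$$

Thus we calculate that

$$\operatorname{Var} \mathbb{E}^W (T' - T)(U' - U)
  = \frac{3 \binom{n}{3}}{\binom{n}{2}^2}\, p^3 (1-p)
    \bigl\{3(1 - 2p)^2 (1-p)^2 + p(1-p)(4 - 6p)^2
           + (n-2) p^2 (5 - 6p)^2\bigr\}$$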

and, using that p(5 — 6p) ^ || and — p) ^ we obtain 



^VarE^(r' -r)(C/' - U) < 
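For $\operatorname{Var} \mathbb{E}^W (V' - V)^2$ we introduce the notation

$$N_i := \sum_{j : j \ne i} I_{i,j}, \qquad
  M_{i,j} := \sum_{k : k \ne i,j} I_{i,k} I_{k,j}. \tag{5.8}$$

Then

$$T = \frac12 \sum_i N_i, \tag{5.9}$$

$$V = \frac12 \sum_i N_i (N_i - 1), \tag{5.10}$$

$$U = \frac16 \sum_{i \ne j} I_{i,j} M_{i,j}. \tag{5.11}$$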
«7^i 
We have



= -^Le'^ (4(n - 2)(y + T) - 8T + ST^ - 161/)) 

+ (1 - 2p)E^ [2 ^ J A^f _ ST + 2 ^ ^ijA^.A^i - 16^)) [ , 

where we used (5.9) and (5.10) for the last equation. To simplify this
expression, note that $\sum_i N_i^2 = 2T + 2V$, and
$\sum_{i \ne j} N_i N_j = 4T^2 - 2T - 2V$ as well as

^ jivf = ^ ii,jii,kii/ + + 2T, 

i^j i,j,k,t distinct 



and 

so that 
7.W nr' 



J2 kJN^Nj = Yl kikkiji + ^v + m + 2T, 

ij^j i,j,k,l distinct 



{V' -vy = —\ 2p{n - 4)r + 2V{np - lOp + 2) + 6(1 - 2p)U + 4pT' 
[2) 



i,j,k,i 
distinct 

With the notation T for the centralized variable, we have that 
VarE^(y'-F)2 

< < /(2n - 8 + Apn^ - 4pnf Var(r) + A{np - lOp + 2)^ Var(y) 



+ 36(1 - 2p)2 Var(C/) + 16^^ Var(f 2) 

+ (l-2p)2Var( Yl ^^kjkkiki + ke))]^ 



distinct 



where we used that in general
$\operatorname{Var} \sum_{i=1}^k X_i \le k \sum_{i=1}^k \operatorname{Var} X_i$
and (5.4). Here, the variances for $T$, $V$ and $U$ are given in (5.5),
(5.6), and (5.7). To simplify the expression, we use that
$p^3 (1 - p) \le 27/256$ to bound

.2/o„ O , A 2 A \2^r„(rr\ ^ 27/n\ O/ , „n2 



p^(2n - 8 + 4pn - 4pn)^ Var(r) ^ — I ]n\n + 2Y. 

64 \ 2 , 
Similarly, we bound with $p^2 (1 - p) \le 4/27$ and $n > 4$

4(np - lOp + 2)2 Var(y) ^ ^n^(n - l)(n - 2)(n + 1), 



and 



81 

36(1 - 2j))2 Var(m ^ n(n - l)(n - 2)(3n + 2). 

256 



We note that
$\mathbb{E}\, \tilde I_{i,j} \tilde I_{u,v} \tilde I_{s,t} \tilde I_{k,\ell} = 0$
unless either all pairs of indices are the same, or the product is made up
of two distinct index pairs only. Hence

Var r2 = ^ ^ ^ ^ mJuJsAe 

i<j u<v s<t k<i 



^ p{i-p) \?>[[^]-i]p{i-p) + {i-pr+p' 



givmg 



27 

lep^VarT^ ^ n^{n-l). 



For the last variance term, we use that conditional variances can be bounded 
by unconditional variances, giving 

^ ^ Y&llijli,k{li,e + Ij,l) 

i,j,k,e 

distinct 

+ E E mi,j,k,i)^{r,s,t,u)) 

i,j,k,£ distinct r,s,t,u distinct 

X l{\{i,j,k,£}n{r,s,t,u}\ > 2) 

X Co\{IijIi^k{Ii/ + Ijl),Ir,sIr,t{Ir,u + L,u))) 

Here we used the independence of the edge indicators. For the last bound 
we employed that p^{l - p^) ^ 1/4, that p^il - p^) ^ (\/3 - l)/3, and that 
n > 4. Collecting the variances and using that n > 4, 

Var(E^(y'-y)2) 

- - 2)(3n + 2) + ^n3(n - 1) + 3n2 (^^ 

This gives that 



^Var(E^(F/ - ^1)2) < Qn 
For $\mathbb{E}^W (V' - V)(U' - U)$, we have,
- V){U' - U) 

= -pT - 6(1 - P)U + (1 - 2p) J2 ^'^I^,JNiMi^ . 

where we used (jS.lip . Now 



Similarly, 



distinct 



i^j i,j,k,l 

distinct 



so that 

¥}^{V' -V){U' -U) = ^(2pV + Q{l-2p)U+p ^ h,kh,jli/ 



i,j,k,i 
distinct 



+ (1 - 2p) Y ^i,i^i,khih,j j • 



i,j,k,i 
distinct 



Furthermore, as before, 



Var Yl ^^hkhjki 

i,j,k,i distinct 



Similarly as for ()5.2p . 

Var ^ ]E^IijIi^kIi/Ij,e ^ Var ^ lijli^kli/lji 

distinct distinct 

^ (4) +6(2^^(1-/ 
n\ / 1 1 / n 



^ ,47 V256 ^ 16 V2 



As p < 1, we obtain that 
VarE^iV -V)iU' -U) 



so that 



Var (E^(y/ - )([/{- C/i)) < n~3 + lln" 
Finally,



E^(C/' - U)' = :^^{P^'^MI^ + (1 - 2p)E^I,,M5). 



We have that 



k:kj^i,j £:£^i,j k:k^i,j £:£j^i,j,k 



and 



^ijMfj = lijMij + ^ ^ lijli^khjli/hj, 

k:kjti,j £:£y^i,j,k 



SO that 

([/' - Uf = -L-S2pV + 6(1 - 2p)C/ + p Yl ^'^kklkjh£l£,j 



2 2 I 



distinct 



+ {l-2p) E^I^jI^,kIk,jIi,£hA- 



As for (|5.2p . we obtain 



Var E^/,,fc4,,/,,,I,- £ < L 

i,j,k,£ \ / \ 



distinct 



^(l-/) + 6(2]/(l-/^ 



distinct 

and 



distinct 



Again using our variance inequalities, we thus obtain that 
Var (E^(C/' - Uf) 

+ 36(1 - 2p)2((n - 2)^2 + 1(4 _ 5p + 



+ /( J (/(l-p')+6( Jp^(l-p'' 



+ {i-2prr] p^(i-p^)+6r]p\i-p^) 



^ 22 + 2n^ 



so that 

Var(E^(C/' - C/)2) < Sn"^ + 271""^. 
5.3. Calculation of the third moments. Firstly,
$\mathbb{E}|T' - T|^3
 = \binom{n}{2}^{-1} \sum_{i<j} \mathbb{E}|I'_{i,j} - I_{i,j}|^3
 = 2p(1 - p) \le \frac12$, so that

$$\mathbb{E}|T_1' - T_1|^3 = \frac{(n-2)^3}{n^6}\, 2p(1 - p) < \tfrac12 n^{-3}.$$

Similarly,

E|y' - vf 

= 2p{l-p){n-2)x 

X (8/ + 2p{l-p) + 2(n - 3)(2p2 + 2/) + 8(n - 3)(n - 4)/) , 

so that 



Lastly, 



so that 